 contextual relation


On the Expressive Power of Contextual Relations in Transformers

Fraiman, Demián

arXiv.org Machine Learning

Transformer architectures have achieved remarkable empirical success in modeling contextual relationships in natural language, yet a precise mathematical characterization of their expressive power remains incomplete. In this work, we introduce a measure-theoretic framework for contextual representations in which texts are modeled as probability measures over a semantic embedding space, and contextual relations between words are represented as coupling measures between them. Within this setting, we introduce the Sinkhorn Transformer, a transformer-like architecture. Our main result is a universal approximation theorem: any continuous coupling function between probability measures that encodes the semantic relation coupling measure can be uniformly approximated by a Sinkhorn Transformer with appropriate parameters.
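
The abstract above does not spell out the architecture, so the following is only a minimal sketch of the classical Sinkhorn iteration for entropy-regularized couplings between two discrete probability measures, which is presumably the building block the name refers to. The function name `sinkhorn_coupling`, the toy cost matrix, and the parameters `eps` and `n_iters` are illustrative assumptions, not taken from the paper.

```python
import math


def sinkhorn_coupling(cost, mu, nu, eps=1.0, n_iters=500):
    """Approximate an entropic-regularized coupling between discrete measures.

    cost[i][j] is a (hypothetical) cost of pairing atom i of mu with atom j of nu.
    Returns a matrix P whose rows sum approximately to mu and columns to nu.
    """
    n, m = len(mu), len(nu)
    # Gibbs kernel derived from the cost matrix.
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [mu[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [nu[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy example: two "texts" with 2 and 3 atoms (assumed data).
cost = [[0.0, 1.0, 4.0], [1.0, 0.0, 1.0]]
mu = [0.5, 0.5]
nu = [0.2, 0.3, 0.5]
P = sinkhorn_coupling(cost, mu, nu)
row_sums = [sum(row) for row in P]
col_sums = [sum(P[i][j] for i in range(len(mu))) for j in range(len(nu))]
```

In this sketch the returned matrix `P` plays the role of a coupling measure: its marginals `row_sums` and `col_sums` recover `mu` and `nu` up to numerical tolerance.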





Contextual Reasoning for Scene Generation (Technical Report)

Bozzato, Loris, Eiter, Thomas, Kiesel, Rafael, Stepanova, Daria

arXiv.org Artificial Intelligence

We present a continuation of our previous work, in which we developed the MR-CKR framework to reason with knowledge overriding across contexts organized in multi-relational hierarchies. Reasoning is realized via ASP with algebraic measures, allowing for flexible definitions of preferences. In this paper, we show how to apply our theoretical work to real autonomous-vehicle scene data. The goal of this work is to apply MR-CKR to the problem of generating challenging scenes for autonomous vehicle learning. In practice, most of the scene data for AV learning models common situations, so it can be difficult to capture cases where a particular situation occurs (e.g., partial occlusions of a crossing pedestrian). The MR-CKR model allows for data organization exploiting the multi-dimensionality of such data (e.g., temporal and spatial). Reasoning over multiple contexts enables the verification and configuration of scenes, using the combination of different scene ontologies. We describe a framework for semantically guided data generation, based on a combination of MR-CKR and Algebraic Measures. The framework is implemented in a proof-of-concept prototype exemplifying some cases of scene generation.


Image Parsing with Stochastic Scene Grammar

Zhao, Yibiao, Zhu, Song-chun

Neural Information Processing Systems

In contrast to previous scene labeling work that applied discriminative classifiers to pixels (or super-pixels), we use a generative Stochastic Scene Grammar (SSG). This grammar represents the compositional structures of visual entities from scene categories, 3D foreground/background, 2D faces, to 1D lines. The grammar includes three types of production rules and two types of contextual relations. Production rules: (i) AND rules represent the decomposition of an entity into sub-parts; (ii) OR rules represent the switching among sub-types of an entity; (iii) SET rules represent an ensemble of visual entities. Contextual relations: (i) Cooperative "+" relations represent positive links between binding entities, such as hinged faces of an object or aligned boxes; (ii) Competitive "-" relations represent negative links between competing entities, such as mutually exclusive boxes. We design an efficient MCMC inference algorithm, namely Hierarchical cluster sampling, to search in the large solution space of scene configurations. The algorithm has two stages: (i) Clustering: It forms all possible higher-level structures (clusters) from lower-level entities by production rules and contextual relations.


Reasoning on Multi-Relational Contextual Hierarchies via Answer Set Programming with Algebraic Measures

Bozzato, Loris, Eiter, Thomas, Kiesel, Rafael

arXiv.org Artificial Intelligence

Dealing with context-dependent knowledge has led to different formalizations of the notion of context. Among them is the Contextualized Knowledge Repository (CKR) framework, which is rooted in description logics but, on the reasoning side, links strongly to logic programs and Answer Set Programming (ASP) in particular. The CKR framework caters for reasoning with defeasible axioms and exceptions in contexts, which was extended to knowledge inheritance across contexts in a coverage (specificity) hierarchy. However, the approach supports only this single type of contextual relation and the reasoning procedures work only for restricted hierarchies, due to non-trivial issues with model preference under exceptions. In this paper, we overcome these limitations and present a generalization of CKR hierarchies to multiple contextual relations, along with their interpretation of defeasible axioms and preference. To support reasoning, we use ASP with algebraic measures, which is a recent extension of ASP with weighted formulas over semirings that allows one to associate quantities with interpretations depending on the truth values of propositional atoms. Notably, we show that for a relevant fragment of CKR hierarchies with multiple contextual relations, query answering can be realized with the popular asprin framework. The algebraic measures approach is more powerful and enables e.g.


Hyperspectral Image Classification With Context-Aware Dynamic Graph Convolutional Network

Wan, Sheng, Gong, Chen, Zhong, Ping, Pan, Shirui, Li, Guangyu, Yang, Jian

arXiv.org Machine Learning

In hyperspectral image (HSI) classification, spatial context has demonstrated its significance in achieving promising performance. However, conventional spatial context-based methods simply assume that spatially neighboring pixels should correspond to the same land-cover class, so they often fail to correctly discover the contextual relations among pixels in complex situations, leading to imperfect classification results on some irregular or inhomogeneous regions such as class boundaries. To address this deficiency, we develop a new HSI classification method based on the recently proposed Graph Convolutional Network (GCN), as it can flexibly encode the relations among arbitrarily structured non-Euclidean data. Unlike the traditional GCN, our method adopts two novel strategies to further exploit the contextual relations for accurate HSI classification. First, since the receptive field of a traditional GCN is often limited to a fairly small neighborhood, we propose to capture long-range contextual relations in HSI by performing successive graph convolutions on a learned region-induced graph which is transformed from the original 2D image grids. Second, we refine the graph edge weights and the connective relationships among image regions by learning the improved adjacency matrix and the 'edge filter', so that the graph can be gradually refined to adapt to the representations generated by each graph convolutional layer. Such an updated graph will in turn result in accurate region representations, and vice versa. The experiments carried out on three real-world benchmark datasets demonstrate that the proposed method yields significant improvement in classification performance when compared with some state-of-the-art approaches.
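
To illustrate why successive graph convolutions widen the receptive field, here is a minimal sketch of a standard graph convolution layer with a fixed, symmetrically normalized adjacency matrix. It does not implement the learned adjacency or the 'edge filter' described in the abstract; the helper names (`normalize_adjacency`, `gcn_layer`) and the three-region toy graph are assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def normalize_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, features, weights):
    """One graph convolution: ReLU(normalized_adjacency @ features @ weights)."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, x) for x in row] for row in matmul(propagated, weights)]


# Toy region graph (assumed): a path 0 - 1 - 2, with a signal only on region 0.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
features = [[1.0], [0.0], [0.0]]
weights = [[1.0]]  # identity-like weight so only propagation is visible

one_hop = gcn_layer(adj, features, weights)
two_hop = gcn_layer(adj, one_hop, weights)
```

After one layer, region 2 still carries no signal from region 0 because the two are not adjacent; after a second layer the signal has reached it through region 1, which is the sense in which stacking convolutions captures longer-range relations.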


Image Parsing with Stochastic Scene Grammar

Zhao, Yibiao, Zhu, Song-chun

Neural Information Processing Systems

This paper proposes a parsing algorithm for scene understanding which includes four aspects: computing 3D scene layout, detecting 3D objects (e.g., furniture), detecting 2D faces (windows, doors, etc.), and segmenting background. In contrast to previous scene labeling work that applied discriminative classifiers to pixels (or super-pixels), we use a generative Stochastic Scene Grammar (SSG). This grammar represents the compositional structures of visual entities from scene categories, 3D foreground/background, 2D faces, to 1D lines. The grammar includes three types of production rules and two types of contextual relations. Production rules: (i) AND rules represent the decomposition of an entity into sub-parts; (ii) OR rules represent the switching among sub-types of an entity; (iii) SET rules represent an ensemble of visual entities. Contextual relations: (i) Cooperative “+” relations represent positive links between binding entities, such as hinged faces of an object or aligned boxes; (ii) Competitive “-” relations represent negative links between competing entities, such as mutually exclusive boxes. We design an efficient MCMC inference algorithm, namely Hierarchical cluster sampling, to search in the large solution space of scene configurations. The algorithm has two stages: (i) Clustering: It forms all possible higher-level structures (clusters) from lower-level entities by production rules and contextual relations. (ii) Sampling: It jumps between alternative structures (clusters) in each layer of the hierarchy to find the most probable configuration (represented by a parse tree). In our experiments, we demonstrate the superiority of our algorithm over existing methods on a public dataset. In addition, our approach achieves richer structures in the parse tree.